Abstract
Several regions show long-term changes in richness, generally increases (Figure 1, Table 2). The rest of this study focuses on understanding these long-term changes through the lens of the geographic ranges of individual species and how geographic range relates to the colonization and extinction of those species, and thus to changes in richness.
Any change in richness must be tied to an assemblage-level imbalance between the rates of colonization and extinction of individual species. Most species, if not present every year, both colonized and went extinct, possibly repeatedly (Figure S2). However, the long-term geography of these two processes shows hotspots (Figure 2, Figure S3) of changing richness in the form of spatial clustering of sites with high rates of colonization and extinction (Figure S4, Figure S5). In other words, richness changes take place at a limited number of nearby sites that are characteristically occupied by species that have just gone extinct or have just colonized the region. This result suggests that the geographic distributions of these species might be closely linked to observed changes in richness.
We analyzed geographic distribution of a species in two ways: range size (proportion of sites) and range density (proportion of tows in occupied sites). These metrics are year- and species-specific; their long-term average yields a value characteristic of the species, and averaging characteristic values across species present in a year provides a community-scale metric of geographic range. The two metrics are closely associated (Figure S6), and both predict predict species richness (Figure 3).
Why is richness predicted by geographic range? Richness is highest when the typical range of present species is small and/or sparse (Figure 3), which means that richness is higher when more rare species are present. Colonization and extinction are essential for richness to change, and basic macroecological theory states that species that are about to go extinct or that have just colonized are relatively rare. Therefore, when more rare species are present, richness will be higher because a large fraction of the assemblage has either just colonized or is about to go extinct, which means there will be more species in addition to the baseline common species (which, because they are common, probably don’t go extinct or colonize often in the time series, and so are typically present, ergo “baseline”).
The explanation for why richness is related to geographic range hinges on the notion that as a species becomes more rare it is closer to extinction, and that after a species colonizes it becomes more common. If this pattern holds, and if the intercept and slope doesn’t change much among species, then it should be true that species that are more rare on the long-term should colonize and go extinct more often. Indeed, we find that species with higher total colonization and extinction numbers do tend to be more rare over the long-term (Figure S7). Furthermore, we find that species range size and density reliably decline as extinction approaches, and increase after colonization (Figure 4).
| reg | mu | sd | mu_sd |
|---|---|---|---|
| ai | 52.7 | 2.44 | 5.31 |
| ebs | 103.6 | 5.57 | 5.31 |
| gmex | 142.3 | 4.27 | 5.31 |
| goa | 95.7 | 8.51 | 5.31 |
| neus | 122.0 | 8.24 | 5.31 |
| newf | 65.8 | 4.37 | 5.31 |
| sa | 102.4 | 4.04 | 5.31 |
| shelf | 46.3 | 3.25 | 5.31 |
| wctri | 84.5 | 7.15 | 5.31 |
richness_ts()
Figure 1. Time series of MSOM estimates of region richness. Each point is the posterior mean of regional richness in a year. Lines indicate long-term trends from fitted values of linear regression models predicting richness from time.
naive_msom_scatter()
Figure S1. MSOM richness vs naive richness
MSOM richness and Naive richness are pretty similar. MSOM richness is probably more accurate, or is at least more conservative b/c it has fewer significant trends. But their similarity should help justify using observed presences/ absences in other analyses.
Manuscript paragraph:
MSOM estimates of richness were greater than Naive estimates, but the two methods produced similar temporal dynamics in richness (Figure S1). Henceforth, we report MSOM richness estimates. The greatest long-term average richness was the Gulf of Mexico (142.3), lowest in the Scotian Shelf (46.3). The inter-region average of long-term standard deviations of richness was 5.3; the Aleutian Islands showed the lowest variability (sd = 2.4), and the Gulf of Alaska was the most variable (sd = 8.5).
Looking for trends in species richness in both the naive estimates and the MSOM estimates. Using Kendall’s Tau_b, calculated for 1E4 resamplings of the posterior of richness. Tau is also calculated using a method that removes serial correlation to achieve independence of observations so that the p-values are correct.
load("../../pkgBuild/results/rich_naive_trend_kendall.RData")
rich_naive_trend_kendall[reg!="wcann",BH:=p.adjust(taup, method='BH')]
rich_naive_trend_kendall <- rich_naive_trend_kendall[
reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=taup)
]
load("../../pkgBuild/results/rich_trend_kendall.RData")
rich_trend_kendall[reg!="wcann",BH:=p.adjust(pvalue, method="BH")]
rich_trend_kendall <- rich_trend_kendall[
reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=pvalue)
]
| reg | estimate | BH | p.value |
|---|---|---|---|
| ai | -0.061 | 0.837 | 0.837 |
| ebs | 0.501 | 0.000 | 0.000 |
| gmex | 0.853 | 0.000 | 0.000 |
| goa | 0.205 | 0.405 | 0.360 |
| neus | 0.337 | 0.010 | 0.008 |
| newf | 0.787 | 0.000 | 0.000 |
| sa | -0.569 | 0.000 | 0.000 |
| shelf | 0.705 | 0.000 | 0.000 |
| wctri | 0.764 | 0.005 | 0.003 |
In the Naive estimates, 7 regions had significant \(\tau_b\).
| reg | estimate | BH | p.value |
|---|---|---|---|
| ebs | 0.421 | 0.003 | 0.001 |
| ai | 0.232 | 0.365 | 0.325 |
| goa | 0.242 | 0.365 | 0.284 |
| wctri | 0.609 | 0.042 | 0.019 |
| gmex | 0.098 | 0.600 | 0.600 |
| sa | -0.218 | 0.212 | 0.141 |
| neus | 0.191 | 0.212 | 0.133 |
| shelf | 0.452 | 0.000 | 0.000 |
| newf | 0.727 | 0.001 | 0.000 |
In the MSOM estimates, 4 regions had significant \(\tau_{b}\).
Overall, ~half the regions show positive trends. No regions show significant negative trends (although SEUS is close, depending on the analysis). It’s a bit surprising how much removing the autocorrelation seemed to impact some of the trends (AI in particular, I think).
Manuscript paragraph:
Estimated slopes for long-term trends (Kendall’s \(\tau_b\)) in richness were positive for most regions. For Naive estimates, all the seven positive \(\tau\) were also significantly different from 0, whereas the two negative \(\tau_b\), Aleutian Islands and Southeast US, were not (Table S2). Long-term trends in MSOM estimates of richness had the same sign as Naive trends, except for Aleutian Islands, but now the only significant \(\tau_b\) were the following four regions: Eastern Bering Sea (\(\tau_b\) = 0.41), West Coast US (\(\tau_b\) = 0.61), Scotian Shelf (\(\tau_b\) = 0.45), and Newfoundland (\(\tau_b\) = 0.72) (Table 1). Richness trends were not significant in the Gulf of Mexico, Gulf of Alaska, and Northeast US, Aleutian Islands, Southeast US.
categ_barplot()
Figure S2. Number of species beloning to the categories of both, neither, colonizer, leaver in each region
| ai | ebs | gmex | goa | neus | newf | sa | shelf | wctri | |
|---|---|---|---|---|---|---|---|---|---|
| neither | 39 | 64 | 105 | 63 | 56 | 49 | 69 | 27 | 64 |
| both | 6 | 44 | 31 | 17 | 83 | 12 | 35 | 20 | 15 |
| colonizer | 9 | 2 | 8 | 18 | 1 | 10 | 0 | 1 | 12 |
| leaver | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
It’s the same pattern, whichever way you split it. However, AI is the only region that had more colonizers than both species. An interesting way to think about some of this is that the average sd in richness was 5.314, so when the number of colonizer or leaver species exceed’s that region’s sd, the impact of those categories, which I consider to be dubious, might start being relevant (though it’s not necessarily problematic, nor is this even close to an actual test for the significance of those categories to the trend). EBS and Shelf had significant positive trends in richness and very low numbers in the colonizer category. WCTRI and NEWF had similar numbers in the both and colonizer category.
% Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu % Date and time: Tue, Dec 06, 2016 - 15:35:23col_ext_ts()
Figure NotIncluded. Number of colonizations (blue) and extinctions (red) over time in each region.
In most regions the differences in colonization and extinction numbers are similar over time. The most obvious exceptions are for the 3 regions that showed large initial spikes in richness; the GOA, GMEX, and AI regions initially have much larger numbers of colonizers than leavers, but this number shrinks rapidly until the two rates are ~equal.
For the regions with significant positive slopes, there is no visually obvious increase in colonizations relative to extinctions over time. Because the colonization and extinction numbers tend to track each other over the long-term, it it would be difficult to attribute the long-term changes in richness to a change in just colonization or extinction rates.
Manuscript paragraph:
A time series of richness can be decomposed into the colonizations and extinctions of individual species over time. We categorized species according to the following colonization extinction patterns: present in all years = neither (536 species), colonized and went extinct = both (263 species), initially absent but present every year after its colonization = colonizer (61 species), initially present but absent every year after its extinction = leaver (4 species). Most regions had the same overall ranking (neither > both > colonizer > leaver), except in the Northeast US where both was the most common and neither second, and in the Aleutian Islands where colonizer was the second most common and both third (Figure S2). In general, changes in richness were not due to permanent departures or introductions of species to the region. Furthermore, colonization and extinction rates did not become more dissimilar over time for any region (Figure NotIncluded). Colonizations were initially greater than extinctions in Aleutian Islands, Gulf of Alaska, and Gulf of Mexico, but this difference disappeared in the latter portion of these time series, as evidenced by these regions’ initially rapid increase in richness that later plateaued. The other regions did not show strong trends in the difference between colonizations and extinctions over time, making it difficult to attribute the long-term trends in richness to changes in just colonization or just extinction rates.
ceRate_map(ce="colonization")
Figure 2. Maps of long-term averages of colonizations per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of colonization rate were smoothed using a Gaussian kernel smoother. The smoothed colonization rate is indicated by the color bars in each panel; colors are scaled independently for each region.
ceRate_map(ce="extinction")
Figure S3. Maps of long-term averages of extinctions per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of extinction rate were smoothed using a Gaussian kernel smoother. The smoothed extinction rate is indicated by the color bars in each panel; colors are scaled independently for each region.
Hotspots can be seen in most regions. Newfoundland also has high values around its edge (as opposed to interior), it seems. NEUS and Gmex show very strong hotspots, and other locations tend to be much much lower. Other regions show more of a continuum.
rel_col_ext_rate <- mapDat[,j={
map_smooth_col <- spatstat::Smooth(spatstat::ppp(
x=.SD[,lon], y=.SD[,lat],
marks=.SD[,n_spp_col_weighted], window=mapOwin[[reg]]
), hmax=1)
mark_range_col <- range(map_smooth_col, na.rm=TRUE)*10
map_smooth_ext <- spatstat::Smooth(spatstat::ppp(
x=.SD[,lon], y=.SD[,lat],
marks=.SD[,n_spp_ext_weighted], window=mapOwin[[reg]]
), hmax=1)
mark_range_ext <- range(map_smooth_ext, na.rm=TRUE)*10
ol <- list(
minval_col=mark_range_col[1], maxval_col=mark_range_col[2],
max_o_min_col=do.call("/",as.list(rev(mark_range_col))),
minval_ext=mark_range_ext[1], maxval_ext=mark_range_ext[2],
max_o_min_ext=do.call("/",as.list(rev(mark_range_ext)))
)
lapply(ol, function(x)if(is.numeric(x)){signif(x,3)}else{x})
},by=c("reg")]
| reg | minval_col | maxval_col | max_o_min_col | minval_ext | maxval_ext | max_o_min_ext |
|---|---|---|---|---|---|---|
| ebs | 0.116 | 0.554 | 4.79 | 0.120 | 0.546 | 4.56e+00 |
| ai | 0.057 | 0.483 | 8.53 | 0.000 | 0.174 | 1.37e+09 |
| goa | 0.001 | 0.764 | 652.00 | 0.018 | 0.567 | 3.24e+01 |
| wctri | 0.023 | 1.160 | 50.10 | 0.009 | 0.951 | 1.03e+02 |
| gmex | 0.508 | 3.180 | 6.25 | 0.073 | 3.280 | 4.51e+01 |
| sa | 1.080 | 2.180 | 2.03 | 0.758 | 3.690 | 4.87e+00 |
| neus | 0.106 | 22.500 | 213.00 | 0.087 | 21.100 | 2.42e+02 |
| shelf | 0.000 | 1.220 | 25400.00 | 0.000 | 1.050 | 1.82e+04 |
| newf | 0.036 | 0.582 | 16.40 | 0.011 | 0.612 | 5.71e+01 |
| MEDIAN | 0.057 | 1.160 | 16.40 | 0.018 | 0.951 | 5.71e+01 |
nb_moranI(ce="colonization")
Figure S4. Connectivity and local spatial autocorrelation of colonization events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.
nb_moranI(ce="extinction")
Figure S5. Connectivity and local spatial autocorrelation of extinction events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.
Looking for spatial footprint that will predict richness. Here I characterize the geographic distribution of a species as its range size. This is tricky for richness because richness is a community level trait, but I’ve characterized range size at the species level. In fact, this “species level” attribute is also potentially dyanmic – a species need not have fixed range size. So there are two levels of averaging – for each species take its long-term average value, then for each community take the average across species.
Note that I’d previously used range density in addition to range size. I’ve dropped the density metric in most places in order to simplify the results, because the range size results were closely related to the range density results.
First posterity, I’ll show a figure relating range size and range density. Then I’ll present the figure of how size can ‘predict’ richness. Finally, I’ll explore models more formally expressing the relationship between richness and size.
rangeSizeDens()
Figure S6. Range density versus range size. In panel A each point is a species-region-year combination. In panel B, each point is a region-year. Range size is the proportion of sites occupied, range density the tows in occupied sites. The community metrics in B is calculated by take each species’ long-term average from A, then taking the average across all species present in the community in a given year. Fitted lines in A are from a loess fit.
These plots show that there is a strong relationship between range size and range density. Interestingly, in Figure S6B, the cross-region relationship is negative (if each color had 1 pt), whereas the within-region relationship is positive.
The positive relationship between size and density is not surprising. My interpretation of density is ~population size. Often population size is correlated with range size. I think this is a standard result, but I need to double-check.
rich_geoRange("size", leg=TRUE, legPan=1, panLab=FALSE)
Figure 3. Species richness vs geographic range size. Range size is presented as each species’ long-term average of the proportion of sites it occupied. Solid lines are linear regressions with MSOM richness as the response and the horizontal axis and an intercept as the predictors.
Range size is a pretty good predictor of species richness. I think I had originally missed the range size relationship b/c I hadn’t done the same aggregating procedure. The interpretation I have is that richness is highest when you have a bunch of rare species.
The goal here is to see if species richness is predicted by the typical range size of community’s constituent species. First I’ll run different types of models just to explore whether this is true, in general (across regions). Then I’ll drill in to each region individually to answer the same question.
# This is a function that'll help summarize model fit and coeffs and parameter significance:
mod_smry <- function(m, pred_name=c("density","size","time","type","time:type")){
pred_name <- match.arg(pred_name)
sc <- sem.coefs(m)
sc[,c("estimate","p.value")] <- lapply(sc[,c("estimate","p.value")], signif, 4)
mod_call <- switch(class(m), lmerMod=m@call, lm=m$call)
mod_call <- as.character(mod_call)[2]
fits <- sem.model.fits(m)
fits[,c("estimate","p.value","Marginal","Conditional")] <- lapply(fits[,c("Marginal","Conditional")], signif, 4)
out <- cbind(
mod_call = mod_call,
sc[sc[,"predictor"]==pred_name,],
fits,
AIC=round(AIC(m), getOption("digits"))
)
out[,c(
# "std.error","N",
"Class","mod_call","predictor","estimate","p.value","Marginal","Conditional","AIC"
)]
}
# Make a data set that is useful for these regressions (short variable names, etc):
range_reg <- comm_master[,list(
reg, year, rich=reg_rich, density=propTow_occ_avg, size=propStrata_avg_ltAvg
)]
# Fit different models to the whole data set
rSize_mods <- list()
rSize_mods[[1]] <- lm(rich ~ size, data=range_reg)
rSize_mods[[2]] <- lm(rich ~ size*reg, data=range_reg)
rSize_mods[[3]] <- lme4::lmer(rich ~ size + (1|reg), data=range_reg)
rSize_mods[[4]] <- lme4::lmer(rich ~ size + (size|reg), data=range_reg)
rich_size_smry <- rbindlist(lapply(rSize_mods, mod_smry, pred_name="size"))
# Fit same model to each region separately
rSize_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
rSize_reg_mods[[r]] <- lm(rich ~ size, data=range_reg[reg==ur[r]])
}
rich_s_reg_smry <- data.table(
reg=ur,
rbind(
rbindlist(lapply(rSize_reg_mods, mod_smry, pred_name="size"))
)
)
setkey(rich_s_reg_smry, reg, Class, predictor)
setnames(rich_s_reg_smry, old=c("Marginal","Conditional","p.value"), new=c("MargR2","CondR2","pval"))
setkey(rich_s_reg_smry, reg, Class, mod_call, predictor)
rich_s_reg_smry[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]
rich_s_reg_smry <- dcast(rich_s_reg_smry, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))
| Class | mod_call | predictor | estimate | p.value | Marginal | Conditional | AIC |
|---|---|---|---|---|---|---|---|
| lm | rich ~ size | size | -41.7 | 0.124 | 0.012 | NA | 1926 |
| lm | rich ~ size * reg | size | -298.7 | 0.006 | 0.995 | NA | 927 |
| lmerMod | rich ~ size + (1 | reg) | size | -320.5 | 0.000 | 0.293 | 0.995 | 1133 |
| lmerMod | rich ~ size + (size | reg) | size | -446.7 | 0.000 | 0.332 | 0.999 | 1005 |
| reg | Class | mod_call | MargR2 | CondR2 | AIC | pval_size | BH_size |
|---|---|---|---|---|---|---|---|
| ai | lm | rich ~ size | 0.680 | NA | 46.8 | 0.001 | 0.001 |
| ebs | lm | rich ~ size | 0.863 | NA | 137.9 | 0.000 | 0.000 |
| gmex | lm | rich ~ size | 0.767 | NA | 77.8 | 0.000 | 0.000 |
| goa | lm | rich ~ size | 0.802 | NA | 76.5 | 0.000 | 0.000 |
| neus | lm | rich ~ size | 0.852 | NA | 169.6 | 0.000 | 0.000 |
| newf | lm | rich ~ size | 0.868 | NA | 65.1 | 0.000 | 0.000 |
| sa | lm | rich ~ size | 0.361 | NA | 134.5 | 0.002 | 0.002 |
| shelf | lm | rich ~ size | 0.898 | NA | 124.3 | 0.000 | 0.000 |
| wctri | lm | rich ~ size | 0.957 | NA | 41.1 | 0.000 | 0.000 |
| mod_call | reg | Class | MargR2 | CondR2 | AIC | pval_size | BH_size |
|---|---|---|---|---|---|---|---|
| rich ~ size | NA | NA | 0.783 | NA | 97.1 | 0 | 0 |
All models are pretty good predictors. Well, the most basic model kinda sucks I guess. It needs to account for some of the between-region variation.
As far as picking one or the other, it doesn’t end up mattering much. Range size is a lot better than density in NEUS, and density outperforms size in AI. Otherwise, size as a slight edge over density on average, although both predictors are significant in all regions.
The two metrics of geographic range are well correlated. Furthermore, richness can be predicted pretty well using regressions with either as a predictor. There are large differences among regions, though. This is probably because richness is not readily comparable among most regions. Regions vary mostly in their intercept values, and they have fairly similar slopes (though they are not identical, and model fits improve when allowing slopes to vary among regions; it’s just that the improvement is small compared to allowing intercepts to vary among regions).
The interpretation of the result that geographic distribution predicts species richness is likely associated with species rarity. When the average range density or range size of a community is low, it means it has a lot of species that are rare (at either spatial scale). It’s these rare species that come and go, and form the dynamics of richness that we observe. When that dynamical value is high, it implies that an above-average number of the dynamic species are present. Because those species are transient (dynamic), they are also probably rare.
The result that richness is predicted by geographic range implied an underlying association between range, colonization/ extinction, and richness itself. Above, richness was explained with range. Later, the results will explain how range changes near a colonization/ extinction event. Here, between the two, the results show how the number of colonizations is related to range.
ceEventRange("mean_size")
Figure S7. Number of colonizations and extinctions as a function of range size and range density.
Yup, this is definitely a thing. Long-term average range size and range density predict how many colonizations and extinctions a species is likely to have. This will lead nicely into examing how range changes prior to an extinction or after a colonization.
rangeSize_absenceTime("rangeSize")
Figure 4. Geographic range size vs years until extinction (A) and years after colonization (B). For visualization purposes, range size is averaged across species for each unique value on each axis, and a linear model fit through this average. Statistics in main text use unaggregated data. The horizontal axes were formulated as time until (since) the nearest upcoming (previous) absence. Because range size must be zero when either horizontal axis has a value of zero, points at (0,0) were excluded from figures and analyses.
rangeSize_absenceTime("rangeDensity")
Figure 4b. Geographic range density vs years until extinction (A) and years after colonization (B). For visualization purposes, range density is averaged across species for each unique value on each axis, and a linear model fit through this average. Statistics in main text use unaggregated data. The horizontal axes were formulated as time until (since) the nearest upcoming (previous) absence. Because range density must be zero when either horizontal axis has a value of zero, points at (0,0) were excluded from figures and analyses.
Range size declines near an absence much more consistently than does range density; both are (relatively) low just before extinction and just after colonization. However, range density has much more variable intercepts among regions, whereas range size does not.
This makes sense, at least somewhat, because colonization and extinction events are defined at the site level; though the outcome isn’t necessitated by this formulation, because size could drop suddenly. In fact, when a species is absent, both its range size and its range density must be 0 (though, range density is technically calculated for only those sites that are occupied, so I supposed it’s technically undefined according to the equations I’m using).
I think the regressions for range size should omit an intercept, while the regressions for range density should have it. This might be hard to justify fully a priori (though see my thinking in previous paragraph), so I’ll probably just do a model selection and maybe discuess the difference if one model has an intercept and the other does not.
# handy data set for regressions
rangeTimeDT <- spp_master[!is.na(stretch_type) & propStrata!=0]
rangeTimeDT <- rangeTimeDT[,list(
reg=reg,
event=as.character(event_year),
spp=spp,
type=as.character(stretch_type),
time=ext_dist,
size=propStrata,
density=propTow_occ
)]
# summary function used in tables
mod_smry2 <- function(m){
mod_call <- switch(class(m), lmerMod=m@call, lm=m$call)
mod_call <- as.character(mod_call)[2]
fits <- sem.model.fits(m)
fits[,c("estimate","p.value","Marginal","Conditional")] <- lapply(fits[,c("Marginal","Conditional")], signif, 4)
anova_sum <- car::Anova(m)
out <- data.frame(
mod_call = mod_call,
data.frame(predictor=rownames(anova_sum), anova_sum)[,c("predictor","Pr..Chisq.")],
fits,
AIC=AIC(m)#signif(AIC(m), getOption("digits"))
)
out[,c("Class","mod_call", "predictor", "Pr..Chisq.","Marginal","Conditional","AIC")]
}
# models for range size
sizeCE_mods <- list()
sizeCE_mods[[1]] <- lme4::lmer(size ~ time + (time|spp/reg), data=rangeTimeDT)
# sizeCE_mods[[2]] <- lme4::lmer(size ~ time + (time|spp/reg) + (1|type/reg), data=rangeTimeDT) # error
sizeCE_mods[[2]] <- lme4::lmer(size ~ time*type + (time|spp/reg), data=rangeTimeDT)
# lmerTest::step(sizeCE_mods[[2]]) # doing most complicated throws an error
| Dependent variable: | ||
| size | ||
| (1) | (2) | |
| time | 0.007*** | 0.007*** |
| typepre_ext | 0.0004 | |
| time:typepre_ext | 0.0002 | |
| Constant | 0.062*** | 0.061*** |
| Observations | 5,726 | 5,726 |
| Log Likelihood | 7,292.000 | 7,281.000 |
| Akaike Inf. Crit. | -14,566.000 | -14,539.000 |
| Bayesian Inf. Crit. | -14,506.000 | -14,466.000 |
| Note: | p<0.1; p<0.05; p<0.01 | |
| Class | mod_call | predictor | pval | MargR2 | CondR2 | AIC |
|---|---|---|---|---|---|---|
| lmerMod | size ~ time + (time | spp/reg) | time | 0.000 | 0.124 | 0.84 | -14566 |
| lmerMod | size ~ time * type + (time | spp/reg) | time | 0.000 | 0.123 | 0.84 | -14539 |
| lmerMod | size ~ time * type + (time | spp/reg) | type | 0.570 | 0.123 | 0.84 | -14539 |
| lmerMod | size ~ time * type + (time | spp/reg) | time:type | 0.723 | 0.123 | 0.84 | -14539 |
Range size seems to be important to include type as a fixed effect. It does not need an interaction between time*type. I did additional testing beyond what’s presenting here, and I can confirm that having spp as a random factor is useful, too.
getCoefs <- function(mList){
outList <- list()
for(r in 1:length(ur)){
outList[[ur[r]]] <- rbindlist(lapply(
coef(mList[[r]]), function(x)as.list(colMeans(x))
), idcol=TRUE)
setnames(outList[[ur[r]]], c(".id"), c("randomGroup"))
mod_call <- switch(class(mList[[r]]), lmerMod=mList[[r]]@call, lm=mList[[r]]$call)
mod_call <- as.character(mod_call)[2]
outList[[ur[r]]][,mod_call:=mod_call]
}
outList <- rbindlist(outList, idcol=TRUE)
setnames(outList, c(".id"), c("reg"))
# setcolorder(outList, c("reg", "mod_call", "randomGroup", "(Intercept)", "time", "typepre_ext"))
outList
}
sTime_reg_mods3 <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
sTime_reg_mods3[[r]] <- lme4::lmer(size ~ time + (time|spp), data=rangeTimeDT[reg==ur[r]])
}
sTime_reg_smry3 <- data.table(rbind(
rbindlist(structure(lapply(sTime_reg_mods3, mod_smry2), .Names=ur), idcol=TRUE)
))
setnames(sTime_reg_smry3, old=c(".id","Pr..Chisq.","Marginal","Conditional"), new=c('reg',"pval","MargR2","CondR2"))
setkey(sTime_reg_smry3, reg, Class, mod_call, predictor)
sTime_reg_smry3[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]# adjust p-values for multiple tests
sTime_reg_smry3 <- dcast(sTime_reg_smry3, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))# rearrange so each model on 1 line
modCoef3 <- getCoefs(sTime_reg_mods3)
sTime_reg_smry3 <- merge(sTime_reg_smry3, modCoef3, by=c("reg","mod_call"))
setnames(sTime_reg_smry3, old=c("(Intercept)"), new=c("Int"))
| reg | MargR2 | CondR2 | AIC | pval_time | BH_time | Int | time |
|---|---|---|---|---|---|---|---|
| ai | 0.089 | 0.708 | -168 | 0.002 | 0.002 | 0.135 | 0.007 |
| ebs | 0.070 | 0.868 | -3334 | 0.000 | 0.000 | 0.058 | 0.006 |
| gmex | 0.099 | 0.705 | -1021 | 0.000 | 0.000 | 0.091 | 0.012 |
| goa | 0.084 | 0.683 | -799 | 0.000 | 0.000 | 0.076 | 0.005 |
| neus | 0.129 | 0.734 | -7726 | 0.000 | 0.000 | 0.029 | 0.003 |
| newf | 0.084 | 0.868 | -565 | 0.015 | 0.015 | 0.064 | 0.011 |
| sa | 0.198 | 0.608 | -1195 | 0.000 | 0.000 | 0.092 | 0.013 |
| shelf | 0.226 | 0.670 | -1954 | 0.000 | 0.000 | 0.063 | 0.005 |
| wctri | 0.184 | 0.907 | -288 | 0.000 | 0.000 | 0.022 | 0.016 |
# Fit same model to each region separately
sTime_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
sTime_reg_mods[[r]] <- lme4::lmer(size ~ time + type + (time|spp), data=rangeTimeDT[reg==ur[r]])
}
sTime_reg_smry <- data.table(rbind(
rbindlist(structure(lapply(sTime_reg_mods, mod_smry2), .Names=ur), idcol=TRUE)
))
setnames(sTime_reg_smry, old=c(".id","Pr..Chisq.","Marginal","Conditional"), new=c('reg',"pval","MargR2","CondR2"))
setkey(sTime_reg_smry, reg, Class, mod_call, predictor)
sTime_reg_smry[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]# adjust p-values for multiple tests
sTime_reg_smry <- dcast(sTime_reg_smry, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))# rearrange so each model on 1 line
# add coefficients to the sTime_reg_smry data.table
modCoef <- getCoefs(sTime_reg_mods)
sTime_reg_smry <- merge(sTime_reg_smry, modCoef, by=c("reg","mod_call"))
setnames(sTime_reg_smry, old=c("(Intercept)"), new=c("Int"))
| reg | MargR2 | CondR2 | AIC | pval_time | pval_type | BH_time | BH_type | Int | time | typepre_ext |
|---|---|---|---|---|---|---|---|---|---|---|
| ai | 0.090 | 0.703 | -161 | 0.002 | 0.559 | 0.002 | 0.718 | 0.129 | 0.007 | 0.021 |
| ebs | 0.070 | 0.868 | -3323 | 0.000 | 0.669 | 0.000 | 0.753 | 0.058 | 0.006 | 0.001 |
| gmex | 0.099 | 0.710 | -1014 | 0.000 | 0.141 | 0.000 | 0.594 | 0.086 | 0.012 | 0.017 |
| goa | 0.087 | 0.684 | -791 | 0.000 | 0.463 | 0.000 | 0.718 | 0.078 | 0.005 | -0.011 |
| neus | 0.132 | 0.735 | -7714 | 0.000 | 0.264 | 0.000 | 0.594 | 0.028 | 0.003 | 0.002 |
| newf | 0.081 | 0.865 | -556 | 0.015 | 0.507 | 0.015 | 0.718 | 0.067 | 0.010 | -0.008 |
| sa | 0.193 | 0.604 | -1186 | 0.000 | 0.220 | 0.000 | 0.594 | 0.088 | 0.013 | 0.010 |
| shelf | 0.232 | 0.677 | -1957 | 0.000 | 0.000 | 0.000 | 0.002 | 0.070 | 0.004 | -0.021 |
| wctri | 0.182 | 0.907 | -280 | 0.000 | 0.801 | 0.000 | 0.801 | 0.021 | 0.016 | 0.005 |
sTime_reg_mods2 <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
sTime_reg_mods2[[r]] <- lme4::lmer(size ~ time * type + (time|spp), data=rangeTimeDT[reg==ur[r]])
}
sTime_reg_smry2 <- data.table(rbind(
rbindlist(structure(lapply(sTime_reg_mods2, mod_smry2), .Names=ur), idcol=TRUE)
))
setnames(sTime_reg_smry2, old=c(".id","Pr..Chisq.","Marginal","Conditional"), new=c('reg',"pval","MargR2","CondR2"))
setkey(sTime_reg_smry2, reg, Class, mod_call, predictor)
sTime_reg_smry2[, BH:=round(p.adjust(pval, "BH"), 3), by=c("mod_call", "predictor")]# adjust p-values for multiple tests
sTime_reg_smry2 <- dcast(sTime_reg_smry2, reg+Class+mod_call+MargR2+CondR2+AIC~predictor, value.var=c("pval", "BH"))# rearrange so each model on 1 line
modCoef2 <- getCoefs(sTime_reg_mods2)
sTime_reg_smry2 <- merge(sTime_reg_smry2, modCoef2, by=c("reg","mod_call"))
setnames(sTime_reg_smry2, old=c("(Intercept)"), new=c("Int"))
| reg | MargR2 | CondR2 | AIC | pval_time | pval_time:type | pval_type | BH_time | BH_time:type | BH_type | Int | time | typepre_ext | time:typepre_ext |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ai | 0.117 | 0.736 | -154 | 0.000 | 0.028 | 0.526 | 0.000 | 0.085 | 0.676 | 0.120 | 0.008 | 0.094 | -0.013 |
| ebs | 0.069 | 0.868 | -3309 | 0.000 | 0.574 | 0.669 | 0.000 | 0.738 | 0.753 | 0.058 | 0.005 | -0.001 | 0.001 |
| gmex | 0.099 | 0.714 | -1003 | 0.000 | 0.675 | 0.143 | 0.000 | 0.759 | 0.608 | 0.087 | 0.011 | 0.012 | 0.002 |
| goa | 0.083 | 0.678 | -783 | 0.000 | 0.060 | 0.463 | 0.000 | 0.108 | 0.676 | 0.081 | 0.004 | -0.048 | 0.008 |
| neus | 0.136 | 0.731 | -7713 | 0.000 | 0.000 | 0.270 | 0.000 | 0.001 | 0.608 | 0.026 | 0.004 | 0.007 | -0.002 |
| newf | 0.083 | 0.864 | -547 | 0.014 | 0.491 | 0.505 | 0.014 | 0.736 | 0.676 | 0.065 | 0.011 | 0.002 | -0.006 |
| sa | 0.190 | 0.602 | -1175 | 0.000 | 0.789 | 0.221 | 0.000 | 0.789 | 0.608 | 0.089 | 0.013 | 0.008 | 0.001 |
| shelf | 0.220 | 0.672 | -1950 | 0.000 | 0.015 | 0.000 | 0.000 | 0.067 | 0.002 | 0.072 | 0.004 | -0.037 | 0.004 |
| wctri | 0.228 | 0.920 | -275 | 0.000 | 0.040 | 0.776 | 0.000 | 0.090 | 0.776 | 0.028 | 0.014 | -0.069 | 0.021 |
| reg | MargR2 | CondR2 | AIC | mod |
|---|---|---|---|---|
| ai | 0.089 | 0.708 | -168 | size ~ time + (time | spp) |
| ai | 0.090 | 0.703 | -161 | size ~ time + type + (time | spp) |
| ai | 0.117 | 0.736 | -154 | size ~ time * type + (time | spp) |
| ebs | 0.070 | 0.868 | -3334 | size ~ time + (time | spp) |
| ebs | 0.070 | 0.868 | -3323 | size ~ time + type + (time | spp) |
| ebs | 0.069 | 0.868 | -3309 | size ~ time * type + (time | spp) |
| gmex | 0.099 | 0.705 | -1021 | size ~ time + (time | spp) |
| gmex | 0.099 | 0.710 | -1014 | size ~ time + type + (time | spp) |
| gmex | 0.099 | 0.714 | -1003 | size ~ time * type + (time | spp) |
| goa | 0.084 | 0.683 | -799 | size ~ time + (time | spp) |
| goa | 0.087 | 0.684 | -791 | size ~ time + type + (time | spp) |
| goa | 0.083 | 0.678 | -783 | size ~ time * type + (time | spp) |
| neus | 0.129 | 0.734 | -7726 | size ~ time + (time | spp) |
| neus | 0.132 | 0.735 | -7714 | size ~ time + type + (time | spp) |
| neus | 0.136 | 0.731 | -7713 | size ~ time * type + (time | spp) |
| newf | 0.084 | 0.868 | -565 | size ~ time + (time | spp) |
| newf | 0.081 | 0.865 | -556 | size ~ time + type + (time | spp) |
| newf | 0.083 | 0.864 | -547 | size ~ time * type + (time | spp) |
| sa | 0.198 | 0.608 | -1195 | size ~ time + (time | spp) |
| sa | 0.193 | 0.604 | -1186 | size ~ time + type + (time | spp) |
| sa | 0.190 | 0.602 | -1175 | size ~ time * type + (time | spp) |
| shelf | 0.226 | 0.670 | -1954 | size ~ time + (time | spp) |
| shelf | 0.232 | 0.677 | -1957 | size ~ time + type + (time | spp) |
| shelf | 0.220 | 0.672 | -1950 | size ~ time * type + (time | spp) |
| wctri | 0.184 | 0.907 | -288 | size ~ time + (time | spp) |
| wctri | 0.182 | 0.907 | -280 | size ~ time + type + (time | spp) |
| wctri | 0.228 | 0.920 | -275 | size ~ time * type + (time | spp) |
As stated with the larger (pooled regional data) regressions, range size responds more consistently/ strongly to an approaching extinction/ departure from colonization than does range density. Range size shrinks as extinction approaches, increases as time since colonization increases. In both cases, there was only 1 region for which the direction (into extinction, out of colonization) mattered: for range density, it was the Southeast US (effect of pre-extinction direction = 0.037, p=0.043 [BH correction], Table 12), and for range size it was Scotial Shelf (effect = -0.021, p = 0.002). Time was a significant predictor of range size in all regions; time was not a significant predictor of range density in Newfoundland (p = 0.846 [BH]), and the Scotial Shelf (p = 0.916).
These results support the hypothesis that species rarity is closely associated with proximity to extinction/ colonization, and therefore the probability that the species will contribute to a change in species richness. Furthermore, these relationships have not been directly quantified for entire assemblages. Competing hypotheses exist regarding how range should change approaching extinction and after colonization. We find that the change in range is similar regardless of direction. However, spatial scale was important. Although slopes were similar, the intercept for density was far larger than for range size — the fraction of tows containing a species in occupied sites may remain high even when extinction is imminent, and may similar be large even if colonization was recent. Range size, on the other hand, was much closer to 0 near an absence.
Overall, the results suggest that the spatial footprint of individual species is important for understanding changes in species richness. Furthermore, because species contributing most to the dynamics of richness are those that repeatedly colonize and go extinct, it is meaningful to look at a species’ long-term rarity in order to gauge whether it is likely to contribute to those long-term richness changes. Determining what drives the geographic range of individual species is probably a powerful way to anticipate richness changes.
rangeTimeDT[,nTime:=length(time),by=c("reg","type","event","spp")]
getEPR <- function(x){
col_select <- c("Estimate","Pr(>|t|)")
sx <- summary(x)
EP <- sx$coefficients[2,col_select]
names(EP) <- c("Estimate","Pr")
R <- sx$r.squared
data.table(as.data.table(as.list(EP)), Rsquared=R)
}
o <- rangeTimeDT[nTime>=3,j={
getEPR(lm(size~time))
} ,by=c("reg","type","event","nTime","spp")
]
par(mfcol=c(3,2), mar=c(2,2,0.5,0.5), cex=1, ps=10, mgp=c(1,0.1,0), tcl=-0.1)
hist(o[,Estimate])
hist(o[,(Pr)])
hist(o[,(Rsquared)])
setorder(o, nTime)
o[,j={plot(nTime,Pr); lines(spline(Pr~nTime),col='red',lwd=2)}]
NULL
o[,j={plot(nTime,Estimate); ; lines(spline(Estimate~nTime),col='red',lwd=2)}]
NULL
kable(o[,list(propSignificant=(sum(Pr<0.05)/sum(!is.na(Pr))),n=sum(!is.na(Pr))),by=c("reg","type")])
| reg | type | propSignificant | n |
|---|---|---|---|
| ai | post_col | 0.467 | 15 |
| ebs | post_col | 0.373 | 51 |
| ebs | pre_ext | 0.270 | 37 |
| gmex | post_col | 0.297 | 37 |
| gmex | pre_ext | 0.083 | 12 |
| goa | post_col | 0.343 | 35 |
| neus | pre_ext | 0.071 | 70 |
| neus | post_col | 0.147 | 102 |
| newf | post_col | 0.263 | 19 |
| newf | pre_ext | 0.167 | 6 |
| sa | pre_ext | 0.189 | 37 |
| sa | post_col | 0.114 | 35 |
| shelf | pre_ext | 0.200 | 15 |
| shelf | post_col | 0.346 | 26 |
| wctri | post_col | 0.217 | 23 |
| wctri | pre_ext | 0.500 | 2 |
| ai | pre_ext | 0.000 | 2 |
| goa | pre_ext | 0.250 | 4 |
kable(o[,list(propSignificant=(sum(Pr<0.05)/sum(!is.na(Pr))),n=sum(!is.na(Pr))),by=c("reg","type")][,mean(propSignificant),by=c("type")])
| type | V1 |
|---|---|
| post_col | 0.285 |
| pre_ext | 0.192 |
Exploration Figure histograms of separate regressions of size ~ time; this is for each run-up to an extinction and reach follow-up to a colonization. Trying to understand how regularly the pattern might be observed. Hard to answer because adjusting the restriction on number of events in the run-up or follow-up (nTime) greatly affects the proportion that are significant.
The mixed effect models show a strong tendency for range size to be smaller near an absence. I figured that this relationship wouldn’t be apparent for all cases, which ended up being the result. However, the significance of these relationships depends on the number of years for each event stretch. So sample size matters. On one hand, this could hold implications for the rate of decline, so excluding short stretches might also exclude more cases with abrupt declines/ increases (although, these may well be preceded by long durations of stability, so not all abrupt declines/ increases would be excluded necessarily). However, the slope estimate doesnt change a whole lot across sample size: from 0 to ~10 years, increasing sample size (duration) also increases the slope (going from slightly negative to pretty consistently positive). Then it levels out, and doesn’t continue increasing. So it is not the case that longer stretches are long because they have more gradual slopes, necessarily. Actually, in the range of data for which there does appear to be a trend (0-10 samples), the slope becomes larger which is the opposite of “longer stretches result from more gradual processes” notion (thinking in terms of extinction).
Ultimately, I think the mixed effects model is far more appropriate so long as conclusions are cauched in general patterns, not in being able to detect the pattern for any single species. While it could have been interesting to present the many-regression results too, I think it is a separate analysis to consider how often this rule would apply to individual species. Such an analysis would benefit from understanding when the rule does and does not apply, not just the apparent frequency of relevance.
load("../manuscript/manuscript_binSize.RData")
par(mfrow=c(3,3), mar=c(3,3,0.5,3), oma=c(1,1,1,0.5))
for(r in 1:length(regs)){
image(bin_sites[[r]], axes=FALSE, col=fields::tim.colors())
mtext(regs[r], side=3, line=0, font=2)
xl <- as.numeric(rownames(bin_sites[[r]]))
yl <- as.numeric(colnames(bin_sites[[r]]))
axis(1, at=seq(0,1, length.out=length(xl)), labels=xl)
axis(2, at=seq(0,1, length.out=length(yl)), labels=yl)
if(r==length(regs)){
mtext("Longitude-latitude bin size (degrees)", side=1, line=0, outer=TRUE)
mtext("Depth bin size (meters)", side=2, line=-0.5, outer=TRUE)
}
fields::image.plot(bin_sites[[r]], axes=FALSE, legend.only=TRUE, graphics.reset=TRUE)
}
Bin Size How Bin size (lon, lat, depth) affects the number of sites in a region. The vertical axis is the size of the depth bin in meters, horizontal axis is the size of the lon-lat bin in degrees, and the colors indicate the number of sites in that region that are sampled for at least 85% of years.
scaled_bin_sites <- lapply(bin_sites, function(x){x2 <- x; x2[,] <- scale(c(x)); x2})
mean_sbs <- apply(simplify2array(scaled_bin_sites), c(1,2), mean, na.rm=TRUE)
min_sbs <- apply(simplify2array(scaled_bin_sites), c(1,2), min, na.rm=TRUE)
par(mfrow=c(2,1), mar=c(4,4,1,5))
image(mean_sbs, axes=FALSE, col=fields::tim.colors())
mtext("cross-region average of z-scores", side=3, line=0, font=2)
xl <- as.numeric(rownames(bin_sites[[1]]))
yl <- as.numeric(colnames(bin_sites[[1]]))
axis(1, at=seq(0,1, length.out=length(xl)), labels=xl)
axis(2, at=seq(0,1, length.out=length(yl)), labels=yl)
mtext("Longitude-latitude bin size (degrees)", side=1, line=2)
mtext("Depth bin size (meters)", side=2, line=2)
fields::image.plot(mean_sbs, axes=FALSE, legend.only=TRUE, graphics.reset=TRUE)
image(min_sbs, axes=FALSE, col=fields::tim.colors())
mtext("cross-region minimum of z-scores", side=3, line=0, font=2)
axis(1, at=seq(0,1, length.out=length(xl)), labels=xl)
axis(2, at=seq(0,1, length.out=length(yl)), labels=yl)
mtext("Longitude-latitude bin size (degrees)", side=1, line=2)
mtext("Depth bin size (meters)", side=2, line=2)
fields::image.plot(min_sbs, axes=FALSE, legend.only=TRUE, graphics.reset=TRUE)
Bin Size How Bin size (lon, lat, depth) affects the number of sites in a region. To compare across regions, we can first z-score each region, then take the cross-region average for each bin size combination.
sumry_counts <- data_all[reg!="wcann",
j = {
yi <- table(diff(sort(unique(year))))
yi_form <- paste0(names(yi), paste("(",yi,")",sep=""))
data.table(
"N" = trawlData::lu(year),
"Years" = paste(min(year),max(year),sep=" - "),
"Frequency" = paste(yi_form,collapse=", "),
"Sites" = trawlData::lu(stratum),
# .SD[,list("max. Tows"=max(trawlData::lu(haulid)), "Average Tows" = mean(trawlData::lu(haulid))),by=c("stratum","year")]
"Tows" = paste0(
.SD[,trawlData::lu(haulid),by=c("stratum","year")][,max(V1)],
paste0("(",.SD[,trawlData::lu(haulid),by=c("stratum","year")][,round(mean(V1),1)],")")
),
# "Max. Tows" = .SD[,trawlData::lu(haulid),by=c("stratum","year")][,max(V1)],
# "Avg Tows" = .SD[,trawlData::lu(haulid),by=c("stratum","year")][,mean(V1)],
"Species" = trawlData::lu(spp)
)
},
by=c('reg')
]
setnames(sumry_counts, 'reg', "Region")
sumry_counts[,Region:=factor(Region, levels=c("ebs","ai","goa","wctri","gmex","sa","neus","shelf","newf"), labels=c("E. Bering Sea","Aleutian Islands","Gulf of Alaska","West Coast US","Gulf of Mexico","Southeast US", "Northeast US", "Scotian Shelf","Newfoundland"))]
setkey(sumry_counts, Region)
sumry_counts[,Org:=c("AFSC","AFSC","AFSC","NMFS","GSMFC","SCDNR","NEFSC","DFO","DFO")]
# sumry_counts[,"Avg Tows":=round(eval(s2c("Avg Tows"))[[1]],1)]
stargazer(sumry_counts, summary=FALSE, rownames=FALSE, column.sep.width="2pt", digits.extra=0, digits=NA)
% Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu % Date and time: Tue, Dec 06, 2016 - 15:36:08
# \begin{table}[!htbp] \centering
# % \caption{}
# % \label{}
# \begin{tabular}{@{\extracolsep{2pt}} cccccccc}
# \\[-1.8ex]\hline
# \hline \\[-1.8ex]
# Region & N & Years & Frequency & Sites & Tows & Species & Org \\
# \hline \\[-1.8ex]
# E. Bering Sea & $31$ & 1984 - 2014 & 1(30) & $138$ & 4(1.6) & $110$ & $\text{AFSC}^{*}$ \\
# Aleutian Islands & $12$ & 1983 - 2014 & 2(5), 3(4), 4(1), 5(1) & $82$ & 12(2.8) & $55$ & $\text{AFSC}^{*}$ \\
# Gulf of Alaska & $13$ & 1984 - 2013 & 2(7), 3(5) & $89$ & 15(4.4) & $98$ & $\text{AFSC}^{*}$ \\
# West Coast US & $10$ & 1977 - 2004 & 3(9) & $84$ & 91(4.7) & $92$ & $\text{NMFS}^{\dagger}$ \\
# Gulf of Mexico & $17$ & 1984 - 2000 & 1(16) & $39$ & 39(5.7) & $144$ & $\text{GSMFC}^{\ddagger}$ \\
# Southeast US & $25$ & 1990 - 2014 & 1(24) & $24$ & 13(3.9) & $104$ & $\text{SCDNR}^{\S}$ \\
# Northeast US & $32$ & 1982 - 2013 & 1(31) & $100$ & 10(3) & $141$ & $\text{NEFSC}^{\P}$ \\
# Scotian Shelf & $41$ & 1970 - 2010 & 1(40) & $48$ & 11(2.5) & $48$ &$\text{DFO}^{\nabla}$ \\
# Newfoundland & $16$ & 1996 - 2011 & 1(15) & $191$ & 9(2.2) & $72$ & $\text{DFO}^{\nabla}$ \\
# \hline \\[-1.8ex]
# \end{tabular}
# \end{table}
# knitr::kable(sumry_counts, caption="Years = Total number of years sampled, Year Range = the minimum and maximum of years sampled, Year Interval = the number of years elapsed between samples (and the frequency of this interval in parentheses), Sites = the total number of sites in the region, Max. Tows = maximum number of tows per site per year, Average Tows = the average number of tows per site per year, Total Species = the total number of species observed in the region across all sites and years.")
reg_depthStratum <- c(
"ebs" = 500,
"ai" = 100,
"goa" = 500,
"wctri" = 100,
"wcann" = 500,
"gmex" = 500,
"sa" = 500,
"neus" = 500,
"shelf" = 500,
"newf" = 500
)
reg_tolFraction <- c(
"ebs" = 0,
"ai" = 0.15,
"goa" = 0,
"wctri" = 0.15,
"wcann" = 0.15,
"gmex" = 0.15,
"sa" = 0.15,
"neus" = 0.15,
"shelf" = 0.15,
"newf" = 0.15
)
regs <- names(reg_tolFraction)
yr_subs <- list(
ebs = bquote((year)>1983),
ai = NA,
goa = NA,
gmex = bquote((year)>1983 & (year)<2001),
neus = bquote((year)>1981 & (year)<2014),
newf = bquote((year)>1995),
sa = bquote((year)>=1990),
shelf = bquote((year)!=2011),
wctri = NA
)
yr_ablin <- list(
ebs = 1983,
ai = NA,
goa = NA,
gmex = c(1983,2001),
neus = c(1981,2014),
newf = 1995,
sa = 1989,
shelf = 2011,
wctri = NA
)
yregs <- names(yr_subs)
par(mfrow=c(3, 3), mar=c(2,2,1,0.25), cex=1, mgp=c(1,0.15,0), tcl=-0.15, ps=8)
for(r in 1:length(yregs)){
b <- trim_msom(yregs[r], gridSize=0.5, grid_stratum=TRUE, depthStratum=reg_depthStratum[yregs[r]], tolFraction=0.5, plot=FALSE, cull_show_up=FALSE, trimYears=FALSE)
b_tbl <- b[,table(year,stratum)>0]
b_tbl <- b_tbl[,order(colSums(b_tbl))]
image(b_tbl, axes=FALSE)
at_vals <- seq(0,1,length.out=nrow(b_tbl))
axis(1, at=at_vals, labels=rownames(b_tbl))
abline(v=at_vals[rownames(b_tbl)%in%yr_ablin[[r]]])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
}
Trimming Years b/c of Strata, Counts of Analyzed and Excluded Strata Along the vertical axis is a stratum ID. Along the horizontal axis is a year of sampling. The colors are binary: the light color indicates that the stratum was sampled in that year, and the red color indicates that the stratum was not sampled. The vertical line indicates that years at the line or to the left (if the line is on the left half of graph) or to the right (if line is on right half of graph) were excluded from the analysis, inclusively. E.g., for E. Bering Sea (ebs), 1982 and 1983 were excluded. Three regions had no years excluded (ai = Aleutian Islands, goa = Gulf of Alaska, wctri West Coast US). The last three regions had years excluded because of changes in the number or location of strata sampled: gmex = Gulf of Mexico 1983 < years < 2001; newf = Newfoundland 1995 < years; sa = Southeast US 1989 < years. Three regions had years excluded for reasons other than number of strata sampled: ebs = E. Bering Sea years < 1984, years excluded due to massive early increase in number of species sampled each year, change in identification suspected; neus = Northeast US 1981 < years < 2014; shelf = Scotian Shelf years < 2011, bottom temperature not available in final year (used as covariate in occupancy model). Finally, note that in GoA strata had to be present in 100% of years or else they were not included; this was done b/c in 2001 the western-most stratum that was sampled was much further to the east than in other years. Similarly, in E. Bering Sea strata had to be present in all years, otherwise the Northeastern extent of the sampled range was reduced substantially in several years.
par(mfrow=c(9, 5), mar=c(2,2,1,0.25), cex=1, mgp=c(1,0.15,0), tcl=-0.15, ps=8)
for(r in 1:length(yregs)){
b <- trim_msom(yregs[r], gridSize=0.5, grid_stratum=TRUE, depthStratum=reg_depthStratum[yregs[r]], tolFraction=0.15, plot=FALSE, cull_show_up=FALSE, trimYears=FALSE)
plot(b[,list("# strata sampled"=trawlData::lu(stratum)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("min latitude"=min(lat)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("max latitude"=max(lat)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("min longitude"=min(lon)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("max longitude"=max(lon)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
}
Trimming Years b/c of Stratum Locations, Stats for Analyzed and Excluded Strata The number and coordinate extreme of strata in each region if all years had been included in the analysis, and if strata were only removed if they were absent in more than 15% of years. Note that the 15% tolerance rule used here can differ from the implemented rule in two ways: 1) for regions with years excluded, the increased number of years may change whether or not a stratum meets the 15% cutoff; 2) two of the regions (E Bering Sea and Gulf of Alaska) had a 0% tolerance, therefore the same strata were sampled in each year for these regions (though the changes in coordinate extrema and stratum count illustrate the need for the unique tolerance rule in these regions).
par(mfrow=c(9, 5), mar=c(2,2,1,0.25), cex=1, mgp=c(1,0.15,0), tcl=-0.15, ps=8)
for(r in 1:length(yregs)){
# b <- trim_msom(yregs[r], gridSize=0.5, depthStratum=reg_depthStratum[yregs[r]], tolFraction=reg_tolFraction[yregs[r]], grid_stratum=TRUE, plot=FALSE, cull_show_up=FALSE)
b <- data_all[reg==yregs[r]]
plot(b[,list("# strata sampled"=trawlData::lu(stratum)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("min latitude"=min(lat)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("max latitude"=max(lat)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("min longitude"=min(lon)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
plot(b[,list("max longitude"=max(lon)),by=c("year")])
mtext(yregs[r], side=3, line=0, adj=0, font=2)
abline(v=yr_ablin[[r]])
}
Trimming Years b/c of Stratum Locations, Stats for the Analyzed Strata Changes in stratum statistics (number of strata, min/max lon/lat) for strata included in results in paper.